Clustering under Approximation Stability

Authors

MARIA-FLORINA BALCAN

AVRIM BLUM

ANUPAM GUPTA

Abstract

A common approach to clustering data is to view data objects as points in a metric space, and then to optimize a natural distance-based objective such as the k-median, k-means, or min-sum score. For applications such as clustering proteins by function or clustering images by subject, the implicit hope in taking this approach is that the optimal solution for the chosen objective will closely match the desired "target" clustering (e.g., a correct clustering of proteins by function or of images by who is in them). However, most distance-based objectives, including those above, are NP-hard to optimize. So this assumption by itself is not sufficient, assuming P ≠ NP, to achieve low-error clusterings via polynomial-time algorithms. In this paper, we show that we can bypass this barrier if we slightly extend this assumption to ask that, for some small constant c, not only the optimal solution but also all c-approximations to the optimal solution differ from the target on at most some ε fraction of points; we call this (c, ε)-approximation-stability. We show that under this condition, it is possible to efficiently obtain low-error clusterings even if the property holds only for values c for which the objective is known to be NP-hard to approximate. Specifically, for any constant c > 1, (c, ε)-approximation-stability of the k-median or k-means objectives can be used to efficiently produce a clustering of error O(ε) with respect to the target clustering, as can stability of the min-sum objective if the target clusters are sufficiently large. Thus, we can perform nearly as well in terms of agreement with the target clustering as if we could approximate these objectives to this NP-hard value.
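To make the definition concrete, the following is a minimal Python sketch, not the authors' algorithm. It computes the k-median cost, measures the error between two clusterings as the fraction of points on which they disagree under the best relabeling, and brute-force checks the stability condition on a tiny instance. All function names are hypothetical. As a simplifying assumption, the check only enumerates clusterings induced by choosing k data points as centers and assigning each point to its nearest center, so it tests a necessary condition rather than the full definition, which ranges over all k-clusterings.

```python
from itertools import combinations, permutations

def kmedian_cost(points, centers, dist):
    """k-median score: sum over points of the distance to the nearest center."""
    return sum(min(dist(p, c) for c in centers) for p in points)

def assign(points, centers, dist):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: dist(p, centers[j]))
            for p in points]

def clustering_error(labels_a, labels_b, k):
    """Fraction of points on which two k-clusterings disagree,
    minimized over all relabelings of the clusters of labels_b."""
    n = len(labels_a)
    best = n
    for perm in permutations(range(k)):
        mismatches = sum(1 for a, b in zip(labels_a, labels_b) if a != perm[b])
        best = min(best, mismatches)
    return best / n

def approx_stable_over_center_sets(points, target, k, c, eps, dist):
    """Assumption: only center-induced clusterings are enumerated, so a True
    result is necessary but not sufficient for (c, eps)-approximation-stability."""
    candidates = [
        (kmedian_cost(points, centers, dist), assign(points, centers, dist))
        for centers in combinations(points, k)
    ]
    opt = min(cost for cost, _ in candidates)
    # Every clustering within a factor c of the optimum must be eps-close to the target.
    return all(clustering_error(target, labels, k) <= eps
               for cost, labels in candidates if cost <= c * opt)

# Toy instance: four points on a line forming two well-separated pairs.
points = [0, 1, 10, 11]
target = [0, 0, 1, 1]
line_dist = lambda x, y: abs(x - y)
stable_c2 = approx_stable_over_center_sets(points, target, 2, 2, 0.0, line_dist)
stable_c10 = approx_stable_over_center_sets(points, target, 2, 10, 0.1, line_dist)
```

On this toy instance every center set within a factor 2 of the optimum recovers the target exactly, whereas relaxing to c = 10 admits a center set that misplaces one of the four points. The exhaustive enumeration is exponential in k and is meant only to illustrate the definition on small inputs.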


Similar resources

Thesis Proposal: Approximation Algorithms and New Models for Clustering and Learning

This thesis concerns two fundamental problems in clustering and learning: (a) the k-median and the k-means clustering problems, and (b) the problem of learning under adversarial noise, also known as agnostic learning. For k-median and k-means clustering we design efficient algorithms which provide arbitrarily good approximation guarantees on a wide class of datasets. These are datasets which sa...


Function Approximation Approach for Robust Adaptive Control of Flexible joint Robots

This paper is concerned with the problem of designing a robust adaptive controller for flexible joint robots (FJR). Under the assumption of weak joint elasticity, the FJR is first modeled and converted into singular perturbation form. The control law consists of a FAT-based adaptive control strategy and a simple correction term. The first term of the controller is used to stabilize the slow dy...


Clustering under Local Stability: Bridging the Gap between Worst-Case and Beyond Worst-Case Analysis

Recently, there has been substantial interest in clustering research that takes a beyond worst-case approach to the analysis of algorithms. The typical idea is to design a clustering algorithm that outputs a near-optimal solution, provided the data satisfy a natural stability notion. For example, Bilu and Linial (2010) and Awasthi et al. (2012) presented algorithms that output near-optimal solu...


Application of Pattern Recognition Algorithms for Clustering Power System to Voltage Control Areas and Comparison of Their Results

Finding the collapse-susceptible portion of a power system is one of the purposes of voltage stability analysis. This part, which is a voltage control area, is called the voltage weak area. Determining the weak area and adjacent voltage control areas has special importance in the improvement of voltage stability. Designing an on-line corrective control requires the voltage weak area to be determi...



Publication date: 2009
